๐Ÿ“ฆ sahkal / sorting-algo

๐Ÿ“„ 005_QuickSort - Walkthough.ipynb ยท 285 lines
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# QuickSort\n",
    "\n",
    "Like MergeSort, QuickSort is a divide-and-conquer algorithm. We need to pick a pivot, then sort both sublists that are created on either side of the pivot. Similar to the video, we'll follow the convention of picking the last element as the pivot.\n",
    "\n",
    "Start with our unordered list of items:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "items = [8, 3, 1, 7, 0, 10, 2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's sketch out what a first iteration would look like.\n",
    "\n",
    "We can use `len` to grab the pivot value, but in order to sort in-place we'll also want the index of the pivot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "pivot_index = len(items) - 1\n",
    "pivot_value = items[pivot_index]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because we plan on sorting in-place, we want to iterate through the items to the left of our pivot (`left_items`). When they're larger than `pivot_value` though, we will not increment our position through `left_items`, but instead change `pivot_index`. We'll know we're done with this pass when `pivot_index` and `left_items` index are equal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 7, 3, 10, 8]\n"
     ]
    }
   ],
   "source": [
    "left_index = 0\n",
    "\n",
    "while (pivot_index != left_index):\n",
    "    \n",
    "    item = items[left_index]\n",
    "    \n",
    "    if item <= pivot_value:\n",
    "        left_index += 1\n",
    "        continue\n",
    "    \n",
    "    # Place the item before the pivot at left_index\n",
    "    items[left_index] = items[pivot_index - 1]\n",
    "    # Shift pivot one to the left\n",
    "    items[pivot_index - 1] = pivot_value\n",
    "    # Place item at pivot's previous location\n",
    "    items[pivot_index] = item\n",
    "    # Update pivot_index\n",
    "    pivot_index -= 1\n",
    "\n",
    "print(items)\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You should see:\n",
    "\n",
    "```\n",
    "[0, 1, 2, 7, 3, 10, 8]\n",
    "```\n",
    "\n",
    "\n",
    "When our loop terminated, we knew everything to the left of our pivot was less than pivot, and everything to the right of our pivot was greater than pivot. Great! Now we need to do that again for the sublists that are left and right of pivot's final location.\n",
    "\n",
    "We can do that by abstracting our above code to a function, just passing the list of items as a parameter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 7, 3, 10, 8]\n"
     ]
    }
   ],
   "source": [
    "def sort_a_little_bit(items):\n",
    "    left_index = 0\n",
    "    pivot_index = len(items) - 1\n",
    "    pivot_value = items[pivot_index]\n",
    "\n",
    "    while (pivot_index != left_index):\n",
    "\n",
    "        item = items[left_index]\n",
    "\n",
    "        if item <= pivot_value:\n",
    "            left_index += 1\n",
    "            continue\n",
    "\n",
    "        items[left_index] = items[pivot_index - 1]\n",
    "        items[pivot_index - 1] = pivot_value\n",
    "        items[pivot_index] = item\n",
    "        pivot_index -= 1\n",
    "        \n",
    "items = [8, 3, 1, 7, 0, 10, 2]\n",
    "sort_a_little_bit(items)\n",
    "print(items)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now what would it require to recurse on this? We want to take the result of that iteration and act on it. So first off, we see that in order to call the function again, we need to communicate the final `pivot_index` value. And then with that, we can mark off segments of the list and have our function operate on less than the entire list. So let's change our function to accept the indices it should stay within, and return the pivot_index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 7, 3, 10, 8]\n",
      "pivot index 2\n"
     ]
    }
   ],
   "source": [
    "def sort_a_little_bit(items, begin_index, end_index):    \n",
    "    left_index = begin_index\n",
    "    pivot_index = end_index\n",
    "    pivot_value = items[pivot_index]\n",
    "\n",
    "    while (pivot_index != left_index):\n",
    "\n",
    "        item = items[left_index]\n",
    "\n",
    "        if item <= pivot_value:\n",
    "            left_index += 1\n",
    "            continue\n",
    "\n",
    "        items[left_index] = items[pivot_index - 1]\n",
    "        items[pivot_index - 1] = pivot_value\n",
    "        items[pivot_index] = item\n",
    "        pivot_index -= 1\n",
    "    \n",
    "    return pivot_index\n",
    "\n",
    "items = [8, 3, 1, 7, 0, 10, 2]\n",
    "pivot_index = sort_a_little_bit(items, 0, len(items) - 1)\n",
    "print(items)\n",
    "print('pivot index %d' % pivot_index)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Almost there! Let's create another function, the recursive function we want, that uses this. And then we'll have our top level definition of `quicksort` call it with our initial parameters. But we need a way to know if we're done! We'll use the indices to see if they demark a list of more than one item. If the demarked sublist is 0 or 1 item, we know it's already sorted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 3, 7, 8, 10]\n"
     ]
    }
   ],
   "source": [
    "def sort_all(items, begin_index, end_index):\n",
    "    if end_index <= begin_index:\n",
    "        return\n",
    "    \n",
    "    pivot_index = sort_a_little_bit(items, begin_index, end_index)\n",
    "    sort_all(items, begin_index, pivot_index - 1)\n",
    "    sort_all(items, pivot_index + 1, end_index)\n",
    "    \n",
    "def quicksort(items):\n",
    "    sort_all(items, 0, len(items) - 1)\n",
    "    \n",
    "items = [8, 3, 1, 7, 0, 10, 2]\n",
    "quicksort(items)\n",
    "print(items)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It's a good idea to test a few more scenarios. Does it work with an even number of items? What if they're already sorted?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1]\n",
      "[96, 97, 98]\n"
     ]
    }
   ],
   "source": [
    "items = [1, 0]\n",
    "quicksort(items)\n",
    "print(items)\n",
    "\n",
    "items = [96, 97, 98]\n",
    "quicksort(items)\n",
    "print(items)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Mission Accomplished!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}